Goto

Collaborating Authors

 point cloud representation


PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation

Neural Information Processing Systems

Domain Adaptation (DA) approaches achieved significant improvements in a wide range of machine learning and computer vision tasks (i.e., classification, detection, and segmentation). However, as far as we are aware, there are few methods yet to achieve domain adaptation directly on 3D point cloud data. The unique challenge of point cloud data lies in its abundant spatial geometric information, and the semantics of the whole object is contributed by including regional geometric structures. Specifically, most general-purpose DA methods that struggle for global feature alignment and ignore local geometric information are not suitable for 3D domain alignment. In this paper, we propose a novel 3D Domain Adaptation Network for point cloud data (PointDAN).



Hierarchical Direction Perception via Atomic Dot-Product Operators for Rotation-Invariant Point Clouds Learning

Hu, Chenyu, Li, Xiaotong, Zhu, Hao, Hou, Biao

arXiv.org Artificial Intelligence

Point cloud processing has become a cornerstone technology in many 3D vision tasks. However, arbitrary rotations introduce variations in point cloud orientations, posing a long-standing challenge for effective representation learning. The core of this issue is the disruption of the point cloud's intrinsic directional characteristics caused by rotational perturbations. Recent methods attempt to implicitly model rotational equivariance and invariance, preserving directional information and propagating it into deep semantic spaces. Yet, they often fall short of fully exploiting the multiscale directional nature of point clouds to enhance feature representations. To address this, we propose the Direction-Perceptive Vector Network (DiPVNet). At its core is an atomic dot-product operator that simultaneously encodes directional selectivity and rotation invariance--endowing the network with both rotational symmetry modeling and adaptive directional perception. At the local level, we introduce a Learnable Local Dot-Product (L2DP) Operator, which enables interactions between a center point and its neighbors to adaptively capture the non-uniform local structures of point clouds. At the global level, we leverage generalized harmonic analysis to prove that the dot-product between point clouds and spherical sampling vectors is equivalent to a direction-aware spherical Fourier transform (DASFT). This leads to the construction of a global directional response spectrum for modeling holistic directional structures. We rigorously prove the rotation invariance of both operators. Extensive experiments on challenging scenarios involving noise and large-angle rotations demonstrate that DiPVNet achieves state-of-the-art performance on point cloud classification and segmentation tasks. Our code is available at https://github.com/wxszreal0/DiPVNet.


A Multimodal Deep Learning Approach for White Matter Shape Prediction in Diffusion MRI Tractography

Lo, Yui, Chen, Yuqian, Liu, Dongnan, Zekelman, Leo, Rushmore, Jarrett, Rathi, Yogesh, Makris, Nikos, Golby, Alexandra J., Zhang, Fan, Cai, Weidong, O'Donnell, Lauren J.

arXiv.org Artificial Intelligence

Shape measures have emerged as promising descriptors of white matter tractography, offering complementary insights into anatomical variability and associations with cognitive and clinical phenotypes. However, conventional methods for computing shape measures are computationally expensive and time-consuming for large-scale datasets due to reliance on voxel-based representations. We propose Tract2Shape, a novel multimodal deep learning framework that leverages geometric (point cloud) and scalar (tabular) features to predict ten white matter tractography shape measures. To enhance model efficiency, we utilize a dimensionality reduction algorithm for the model to predict five primary shape components. The model is trained and evaluated on two independently acquired datasets, the HCP-YA dataset, and the PPMI dataset. We evaluate the performance of Tract2Shape by training and testing it on the HCP-YA dataset and comparing the results with state-of-the-art models. To further assess its robustness and generalization ability, we also test Tract2Shape on the unseen PPMI dataset. Tract2Shape outperforms SOTA deep learning models across all ten shape measures, achieving the highest average Pearson's r and the lowest nMSE on the HCP-YA dataset. The ablation study shows that both multimodal input and PCA contribute to performance gains. On the unseen testing PPMI dataset, Tract2Shape maintains a high Pearson's r and low nMSE, demonstrating strong generalizability in cross-dataset evaluation. Tract2Shape enables fast, accurate, and generalizable prediction of white matter shape measures from tractography data, supporting scalable analysis across datasets. This framework lays a promising foundation for future large-scale white matter shape analysis.


Reviews: PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation

Neural Information Processing Systems

Originality: - L3: "to the best of our knowledge, there is no method yet to achieve domain adaptation on 3D data, especially point cloud data" see below [SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud, Wu et al 2018] proposes a domain adaptation pipeline for 3D lidar point cloud to reduce distribution gap between synthetic and real data. Technically the above two are operating in image space (depth semantic segmentation maps, and BEV of point cloud, respectively) but the underlying goal is still to model 3D information from point cloud. The first paper is from 2018 so I do think this paper over claims this'first to do domain adaptation in 3D data' statement a bit. Although it's worth noting that this paper explores point based representation rather than image based, and for classification task rather than point segmentation. But I think the similarity and different should be mentioned and discussed.


TractShapeNet: Efficient Multi-Shape Learning with 3D Tractography Point Clouds

Lo, Yui, Chen, Yuqian, Liu, Dongnan, Legarreta, Jon Haitz, Zekelman, Leo, Zhang, Fan, Rushmore, Jarrett, Rathi, Yogesh, Makris, Nikos, Golby, Alexandra J., Cai, Weidong, O'Donnell, Lauren J.

arXiv.org Artificial Intelligence

Brain imaging studies have demonstrated that diffusion MRI tractography geometric shape descriptors can inform the study of the brain's white matter pathways and their relationship to brain function. In this work, we investigate the possibility of utilizing a deep learning model to compute shape measures of the brain's white matter connections. We introduce a novel framework, TractShapeNet, that leverages a point cloud representation of tractography to compute five shape measures: length, span, volume, total surface area, and irregularity. We assess the performance of the method on a large dataset including 1,065 healthy young adults. Experiments for shape measure computation demonstrate that our proposed TractShapeNet outperforms other point-cloud-based neural network models in both the Pearson correlation coefficient and normalized error metrics. We compare the inference runtime results with the conventional shape computation tool DSI-Studio. Our results demonstrate that a deep learning approach enables faster and more efficient shape-measure computation. We also conduct experiments on two downstream language cognition prediction tasks, showing that shape measures from TractShapeNet perform similarly to those computed by DSI-Studio.


PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation

Neural Information Processing Systems

Domain Adaptation (DA) approaches achieved significant improvements in a wide range of machine learning and computer vision tasks (i.e., classification, detection, and segmentation). However, as far as we are aware, there are few methods yet to achieve domain adaptation directly on 3D point cloud data. The unique challenge of point cloud data lies in its abundant spatial geometric information, and the semantics of the whole object is contributed by including regional geometric structures. Specifically, most general-purpose DA methods that struggle for global feature alignment and ignore local geometric information are not suitable for 3D domain alignment. In this paper, we propose a novel 3D Domain Adaptation Network for point cloud data (PointDAN).


AdaFold: Adapting Folding Trajectories of Cloths via Feedback-loop Manipulation

Longhini, Alberta, Welle, Michael C., Erickson, Zackory, Kragic, Danica

arXiv.org Artificial Intelligence

AdaFold extracts a particle-based representation of cloth from RGB-D images and feeds back the representation to a model predictive control to re-plan folding trajectory at every time-step. A key component of AdaFold that enables feedback-loop manipulation is the use of semantic descriptors extracted from geometric features. These descriptors enhance the particle representation of the cloth to distinguish between ambiguous point clouds of differently folded cloths. Our experiments demonstrate AdaFold's ability to adapt folding trajectories to cloths with varying physical properties and generalize from simulated training to real-world execution.


MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding

Yu, Hai-Tao, Song, Mofei

arXiv.org Artificial Intelligence

In perception, multiple sensory information is integrated to map visual information from 2D views onto 3D objects, which is beneficial for understanding in 3D environments. But in terms of a single 2D view rendered from different angles, only limited partial information can be provided.The richness and value of Multi-view 2D information can provide superior self-supervised signals for 3D objects. In this paper, we propose a novel self-supervised point cloud representation learning method, MM-Point, which is driven by intra-modal and inter-modal similarity objectives. The core of MM-Point lies in the Multi-modal interaction and transmission between 3D objects and multiple 2D views at the same time. In order to more effectively simultaneously perform the consistent cross-modal objective of 2D multi-view information based on contrastive learning, we further propose Multi-MLP and Multi-level Augmentation strategies. Through carefully designed transformation strategies, we further learn Multi-level invariance in 2D Multi-views. MM-Point demonstrates state-of-the-art (SOTA) performance in various downstream tasks. For instance, it achieves a peak accuracy of 92.4% on the synthetic dataset ModelNet40, and a top accuracy of 87.8% on the real-world dataset ScanObjectNN, comparable to fully supervised methods. Additionally, we demonstrate its effectiveness in tasks such as few-shot classification, 3D part segmentation and 3D semantic segmentation.


Generative Design of Crystal Structures by Point Cloud Representations and Diffusion Model

Li, Zhelin, Mrad, Rami, Jiao, Runxian, Huang, Guan, Shan, Jun, Chu, Shibing, Chen, Yuanping

arXiv.org Artificial Intelligence

Efficiently generating energetically stable crystal structures has long been a challenge in material design, primarily due to the immense arrangement of atoms in a crystal lattice. To facilitate the discovery of stable material, we present a framework for the generation of synthesizable materials, leveraging a point cloud representation to encode intricate structural information. At the heart of this framework lies the introduction of a diffusion model as its foundational pillar. To gauge the efficacy of our approach, we employ it to reconstruct input structures from our training datasets, rigorously validating its high reconstruction performance. Furthermore, we demonstrate the profound potential of Point Cloud-Based Crystal Diffusion (PCCD) by generating entirely new materials, emphasizing their synthesizability. Our research stands as a noteworthy contribution to the advancement of materials design and synthesis through the cutting-edge avenue of generative design instead of the conventional substitution or experience-based discovery.